Part 4: tools for reporting
Introduced the basics of Bayesian inference
Estimate models in brms with default priors
How to wrap our head around the priors?
Prior predictive checking
What if you have a model?
Prior Sensitivity Analyses
Posterior predictive checking
Reporting the results (summarizing the posterior)
WritingData.RData
Experimental study on writing instructions
2 conditions:
Often we rely on ‘arbitrarily’ chosen (default) weakly informative priors
What is the influence of the prior (and the likelihood) on our results?
You could set new priors ad hoc, re-run the analyses, and compare (a lot of work, without strict systematic guidelines)
Semi-automated checks can be done with the priorsense package
priorsense package
Recently, a package dedicated to prior sensitivity analyses was launched
Key idea: power-scaling (both prior and likelihood)
background reading:
YouTube talk:
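The power-scaling idea can be written compactly: the prior (or the likelihood) is raised to a power \(\alpha\) and the posterior is recomputed. For prior power-scaling:

\[ p_{\alpha}(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)^{\alpha} \]

With \(\alpha = 1\) we recover the original posterior; if small changes in \(\alpha\) change the posterior a lot, the result is sensitive to the prior. Likelihood power-scaling works analogously with \(p(y \mid \theta)^{\alpha}\, p(\theta)\).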
A first check is done with the powerscale_sensitivity() function
the prior column contains info on sensitivity to the prior (should be lower than 0.05)
the likelihood column contains info on sensitivity to the likelihood (which we want to be high: ‘let our data speak’)
the diagnosis column is a verbalization of a potential problem (- if none)
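In code, this first check is a single call (a sketch, assuming a fitted brms model called M3_custom, the name used later in these slides):

```r
library(priorsense)

# Power-scale both prior and likelihood and summarise per-parameter sensitivity
powerscale_sensitivity(M3_custom)
```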
Sensitivity based on cjs_dist:
# A tibble: 47 × 4
variable prior likelihood diagnosis
<chr> <dbl> <dbl> <chr>
1 b_Intercept 0.00282 0.0448 -
2 b_FirstVersion_GM 0.00201 0.0152 -
3 b_Experimental_condition 0.00170 0.0229 -
4 sd_Class__Intercept 0.00549 0.169 -
5 sd_Class__FirstVersion_GM 0.00411 0.0296 -
6 cor_Class__Intercept__FirstVersion_GM 0.000935 0.171 -
7 sigma 0.00199 0.301 -
8 r_Class[1,Intercept] 0.00116 0.0803 -
9 r_Class[2,Intercept] 0.00114 0.133 -
10 r_Class[3,Intercept] 0.00167 0.123 -
# ℹ 37 more rows
Posterior Predictive Checking
Similar to prior predictive checking in brms
300 distributions of simulated datasets (scores on SecondVersion), overlaid with our observed data (to have an anchoring point)
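In brms this is a one-liner (a sketch, again assuming the fitted model is called M3_custom):

```r
library(brms)

# Draw 300 simulated datasets from the posterior and overlay the observed data
pp_check(M3_custom, ndraws = 300)
```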
Different ways to summarize our results:
visually
credible intervals (ETI & HDI)
ROPE + HDI rule
hypothesis tests
bayesplot package
mcmc_areas() function
mcmc_areas_ridges() function
mcmc_intervals() function
mcmc_areas() function
Gives a posterior distribution including a credible interval that you can set manually with the prob argument:
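A sketch (assuming the fitted model M3_custom; regex_pars selects the fixed effects):

```r
library(bayesplot)

# Density areas for the fixed effects, shading an 89% credible interval
mcmc_areas(M3_custom, regex_pars = "^b_", prob = .89)
```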
mcmc_areas_ridges() function
Almost the same as the previous one, only the vertical spacing changes a bit…
Meanwhile, see how you can easily change the color scheme for bayesplot graphs
mcmc_intervals() function
Summarizes the posterior as a horizontal bar with markers for two CIs.
Here we set one for a 50% and one for an 89% CI
Power combo: as_draws_df() + ggplot2 + ggdist
What does as_draws_df() do?
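It flattens the fitted model into a data frame with one row per posterior draw and one column per parameter. A sketch (assuming M3_custom; the column selection is illustrative):

```r
library(brms)
library(dplyr)

as_draws_df(M3_custom) %>%
  select(b_Intercept, b_FirstVersion_GM,
         b_Experimental_condition, sd_Class__Intercept) %>%
  head(10)
```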
# A tibble: 10 × 4
b_Intercept b_FirstVersion_GM b_Experimental_condition sd_Class__Intercept
<dbl> <dbl> <dbl> <dbl>
1 113. 1.11 3.71 2.91
2 110. 0.989 5.76 3.63
3 111. 0.904 4.47 2.18
4 110. 0.894 6.56 2.84
5 109. 1.12 6.85 2.80
6 112. 0.796 0.773 4.74
7 111. 0.862 4.57 2.46
8 113. 0.875 2.56 3.00
9 111. 1.19 4.62 2.87
10 113. 1.07 2.43 3.50
ggdist geoms
The ggdist package has a set of functions to visualize a distribution
Before we start, we set our own plot theme (not strictly necessary)
We use posterior_PD as a starting point (our draws)
Change the CIs to 50% and 89%
Let’s make a dotplot with 100 dots (research shows dotplots are interpreted best)
We use pivot_longer(everything()) to stack information on multiple parameters
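Put together, such a dotplot could look like this (a sketch; it assumes posterior_PD holds only the selected parameter columns):

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggdist)

posterior_PD %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = value, y = name)) +
  stat_dots(quantiles = 100)   # 100 dots per distribution
```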
Our example: 2 groups according to Experimental_condition
How to visualize the posterior probability of averages for both groups?
posterior_PD %>%
select(
b_Intercept, b_Experimental_condition
) %>%
mutate(
Mean_Control_condition = b_Intercept,
Mean_Experimental_condition = b_Intercept + b_Experimental_condition
) %>%
select(
Mean_Control_condition, Mean_Experimental_condition
) %>%
pivot_longer(everything()) %>%
ggplot(
aes(x = value, color = name, fill = name)
) +
stat_halfeye(
.width = c(.50,.89),
alpha = .40
) +
scale_y_continuous(name = "", breaks = NULL)
Code: see separate script called HOP_script.R
# A tibble: 6 × 3
draw X Pred1
<int> <dbl> <dbl>
1 1 -15 96.6
2 1 -14.9 96.7
3 1 -14.8 96.8
4 1 -14.7 96.9
5 1 -14.6 97.0
6 1 -14.5 97.1
To plot differences between classes we can use class-specific residuals:
as_draws_df(M3_custom) %>%
select(ends_with(",Intercept]")) %>%
pivot_longer(starts_with("r_Class")) %>%
mutate(sigma_i = value) %>%
ggplot(aes(x = sigma_i, y = reorder(name, sigma_i))) +
stat_pointinterval(
point_interval = mean_qi,
.width = .89,
size = 1/6) +
scale_y_discrete(expression(italic(j)), breaks = NULL) +
labs(x = expression(italic(u)[italic(j)])) +
coord_flip()

head(
as_draws_df(M3_custom) %>%
mutate(
ICC = (sd_Class__Intercept^2/(sd_Class__Intercept^2 + sigma^2))) %>%
select(sigma, sd_Class__Intercept, ICC),
5
)
# A tibble: 5 × 3
sigma sd_Class__Intercept ICC
<dbl> <dbl> <dbl>
1 7.84 2.91 0.121
2 7.52 3.63 0.189
3 7.57 2.18 0.0766
4 7.51 2.84 0.125
5 7.80 2.80 0.114
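In formula form, the quantity computed above is the intraclass correlation:

\[ \text{ICC} = \frac{\sigma^2_{\text{Class}}}{\sigma^2_{\text{Class}} + \sigma^2} \]

Because it is computed per draw, we get a full posterior distribution for the ICC.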
as_draws_df(M3_custom) %>%
mutate(
ICC = (sd_Class__Intercept^2/(sd_Class__Intercept^2 + sigma^2))
) %>%
select(ICC) %>%
ggplot(aes(x = ICC)) +
stat_halfeye(
aes(fill = after_stat(level)),
.width = c(.50,.89,.99)
) +
scale_fill_brewer(na.translate = FALSE) +
scale_x_continuous("Intra Class Correlation",
breaks = seq(.00,.60,by =.05)) +
scale_y_continuous("", breaks = NULL) +
labs(
fill = "Credible Interval"
)
Code: see separate script called HOP_MixedEffects_script.R
ROPE: Region Of Practical Equivalence
Set a region of parameter values that can be considered equivalent to having no effect
for standardized effect sizes the advised default is a range from -0.1 to 0.1
this equals 1/2 of a small effect size (Cohen, 1988)
all parameter values in that range are set equal to no effect
ROPE + HDI rule
95% HDI completely outside the ROPE \(\rightarrow\) \(H_0\) rejected
95% HDI completely inside the ROPE \(\rightarrow\) \(H_0\) not rejected
95% HDI partially outside the ROPE \(\rightarrow\) undecided
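The rule can be sketched in a few lines of base R on simulated draws (illustrative numbers, not the workshop model):

```r
# Pretend posterior draws and a ROPE like the one used later in these slides
set.seed(42)
draws <- rnorm(4000, mean = 5, sd = 1.4)
rope  <- c(-1.68, 1.68)

# Shortest interval containing 95% of the draws (a simple HDI)
hdi <- function(x, width = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- floor(width * n)
  i <- which.min(x[(k + 1):n] - x[1:(n - k)])
  c(x[i], x[i + k])
}

interval <- hdi(draws)
if (interval[2] < rope[1] || interval[1] > rope[2]) {
  decision <- "H0 rejected"       # HDI completely outside the ROPE
} else if (interval[1] >= rope[1] && interval[2] <= rope[2]) {
  decision <- "H0 not rejected"   # HDI completely inside the ROPE
} else {
  decision <- "undecided"         # HDI partially outside the ROPE
}
decision
```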
bayestestR package
We can use the equivalence_test() function of the bayestestR package
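The call itself is minimal (a sketch, assuming M3_custom):

```r
library(bayestestR)

# Default ROPE for linear models: +/- 0.1 * SD of the response
equivalence_test(M3_custom)
```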
# Test for Practical Equivalence
ROPE: [-1.68 1.68]
Parameter | H0 | inside ROPE | 95% HDI
-----------------------------------------------------------------
Intercept | Rejected | 0.00 % | [109.27 113.52]
FirstVersion_GM | Accepted | 100.00 % | [ 0.49 1.39]
Experimental_condition | Rejected | 0.00 % | [ 1.70 7.17]
We visualize the equivalence_test() output by wrapping it in plot()
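In code (a sketch, assuming M3_custom; the plot() method comes from the see package):

```r
library(bayestestR)
library(see)

plot(equivalence_test(M3_custom))
```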
parameters package
library(parameters)
model_parameters(
M3_custom,
ci_method = "hdi",
rope_range = c(-1.8,1.8), #sd MarathonTimeM = 17.76 so 17.76*0.1
test = c("rope", "pd")
)
# Fixed Effects
Parameter | Median | 95% CI | pd | % in ROPE | Rhat | ESS
-----------------------------------------------------------------------------------------
(Intercept) | 111.40 | [109.17, 113.42] | 100% | 0% | 1.001 | 2095.00
FirstVersion_GM | 0.93 | [ 0.53, 1.42] | 100% | 100% | 1.001 | 1911.00
Experimental_condition | 4.51 | [ 1.74, 7.20] | 99.83% | 0.45% | 1.000 | 2427.00
# Sigma
Parameter | Median | 95% CI | pd | % in ROPE | Rhat | ESS
----------------------------------------------------------------------
sigma | 7.67 | [7.20, 8.22] | 100% | 0% | 0.999 | 6783.00
https://www.bayesrulesbook.com/
In Nature Reviews
If you like running - like I do - this could be a great companion on your run!
There are many blogs and websites that you can consult if you want to find out more about making graphs.
One that I often fall back to is:
Do not hesitate to contact me!